

Supplementary material for Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

Neural Information Processing Systems

All the source code can be found at our project website https://sites.google.com/view/ In order to prove Theorem 1, we introduce the following lemma, which uses Assumption 1. Lemma 1. The proof is largely based on [2]. Let H^d = H × ... × H be a vector-valued RKHS, and let F[f] be a functional of f. Pure Task Expansion Results on MPE: The VACL results in Figure 1 include entity progression. To study the performance of task expansion specifically, we exclude the entity progression module from VACL and compare with baselines in Simple-Spread with n = 4 and Push-Ball with n = 2. For a fair comparison, we also provide additional experiments that combine GoalGAN and AMIGo with the initial knowledge of easy tasks.


Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

Neural Information Processing Systems

We introduce an automatic curriculum algorithm, Variational Automatic Curriculum Learning (VACL), for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems. We motivate our curriculum learning paradigm through a variational perspective, where the learning objective can be decomposed into two terms: task learning on the current curriculum, and curriculum update to a new task distribution. Local optimization over the second term suggests that the curriculum should gradually expand the training tasks from easy to hard. Our VACL algorithm implements this variational paradigm with two practical components, task expansion and entity curriculum, which together produce a series of training tasks over both the task configurations and the number of entities in the task. Experimental results show that VACL solves a collection of sparse-reward problems with a large number of agents. In particular, using a single desktop machine, VACL achieves a 98% coverage rate with 100 agents in the simple-spread benchmark and reproduces the ramp-use behavior originally shown in OpenAI's hide-and-seek project.
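The easy-to-hard expansion described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the integer task difficulty, the `success_rate` and `train` stand-ins, the thresholds `low` and `high`, and the constants `REACH`, `GAIN` and `OFFSETS` are all hypothetical choices made for illustration, and the entity curriculum is not modeled. The loop alternates the two terms named above: learning on the current tasks, then updating the task set around tasks that are only partially solved.

```python
MAX_DIFFICULTY = 20        # hypothetical: hardest task level in the toy problem
REACH = 3                  # hypothetical: tasks up to skill + REACH succeed sometimes
GAIN = 1                   # hypothetical: skill gained per training round
OFFSETS = (-2, -1, 1, 2)   # hypothetical perturbations used to propose nearby tasks


def success_rate(task, skill):
    """Toy stand-in for evaluating the policy on a sparse-reward task."""
    if task <= skill:
        return 1.0
    if task <= skill + REACH:
        return 0.5
    return 0.0


def train(skill, tasks):
    """Toy stand-in for an RL update: move toward the hardest learnable task."""
    learnable = [t for t in tasks if t <= skill + REACH]
    target = max([skill] + learnable)
    return min(target, skill + GAIN)


def update_curriculum(active, solved, skill, low=0.1, high=0.9):
    """Retire mastered tasks and propose new ones around the learning frontier."""
    frontier = set()
    for task in active:
        rate = success_rate(task, skill)
        if rate > high:
            solved.add(task)      # mastered: recorded, no longer trained on
        elif rate >= low:
            frontier.add(task)    # partially solved: kept for further training
        # tasks with rate < low are too hard for now and are dropped
    if frontier:
        seeds = frontier
    elif solved:
        seeds = {max(solved)}     # fall back to the hardest mastered task
    else:
        seeds = set(active)       # nothing learnable yet: keep the current tasks
    proposals = {
        min(MAX_DIFFICULTY, max(0, seed + offset))
        for seed in seeds
        for offset in OFFSETS
    }
    return sorted(frontier | proposals)


def run_curriculum(iterations=25, easy_task=1):
    """Alternate task learning and curriculum update; return skill per round."""
    skill, active, solved, history = 0, [easy_task], set(), []
    for _ in range(iterations):
        skill = train(skill, active)                       # task learning
        active = update_curriculum(active, solved, skill)  # curriculum update
        history.append(skill)
    return history


if __name__ == "__main__":
    print(run_curriculum())
```

In this toy setting the skill level rises by one per round until it reaches `MAX_DIFFICULTY`, because each update keeps only tasks just beyond the current skill and proposes neighbors of those tasks. Starting from the hardest task instead would give a success rate of zero and no learning signal, which is the sparse-reward failure mode a curriculum is meant to avoid.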

